SAI API Performance Monitoring #2279
Conversation
Signed-off-by: JaiOCP <jai.kumar@broadcom.com>
rck-innovium left a comment
While most of these measurements can be done at the application level, this proposal provides a way to measure the metrics per object operation inside bulk APIs, which cannot be done by application-level performance monitoring.
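To illustrate the point above, here is a minimal C sketch (the bulk function and per-object costs are invented for illustration): timing the whole bulk call at the application level yields only the aggregate, while a per-object outlier is visible only inside the adapter, which is what this proposal exposes.

```c
/* Hypothetical per-object processing costs (microseconds) inside one
 * bulk create; the application cannot observe these individually. */
static const unsigned obj_cost_us[] = {5, 1, 1, 1};
#define NUM_OBJS 4

/* Mock bulk API: the caller only sees the call as a whole, so
 * application-level timing yields just this aggregate total. The
 * 5 us outlier for object 0 is invisible outside the adapter. */
unsigned mock_bulk_create_total_us(void)
{
    unsigned total = 0;
    for (unsigned i = 0; i < NUM_OBJS; i++)
        total += obj_cost_us[i];  /* adapter-side work per object */
    return total;
}
```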
 * @type sai_uint64_t
 * @flags READ_ONLY
 */
SAI_PERFMON_ATTR_PERFDATA,
Please specify the units of this data.
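For context, a self-contained sketch of how an application might read this READ_ONLY u64 attribute. The types are mocked, the get function and handle are hypothetical, and the nanosecond unit is an assumption, since the draft does not specify units (the point of this comment).

```c
#include <stdint.h>

/* Minimal mocks standing in for SAI types; names follow the PR's
 * draft, but this is a sketch, not the real header. */
typedef uint64_t sai_object_id_t;
typedef enum { SAI_PERFMON_ATTR_PERFDATA } sai_perfmon_attr_t;
typedef struct {
    sai_perfmon_attr_t id;
    union { uint64_t u64; } value;
} sai_attribute_t;

/* Mock adapter returning a fixed sample; real units are the open
 * question here (nanoseconds assumed for illustration). */
int get_perfmon_attribute(sai_object_id_t oid, uint32_t count,
                          sai_attribute_t *attrs)
{
    (void)oid;
    for (uint32_t i = 0; i < count; i++)
        if (attrs[i].id == SAI_PERFMON_ATTR_PERFDATA)
            attrs[i].value.u64 = 1500;  /* assumed: 1500 ns */
    return 0;  /* SAI_STATUS_SUCCESS */
}

/* Fetch the PERFDATA value for one perfmon object. */
uint64_t read_perfdata(sai_object_id_t oid)
{
    sai_attribute_t attr = { .id = SAI_PERFMON_ATTR_PERFDATA };
    get_perfmon_attribute(oid, 1, &attr);
    return attr.value.u64;
}
```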
sai_attr_list[2].value.s32 = SAI_PERFMON_METRICS_AVERAGE_LATENCY;

// Configure Time Interval in msec
sai_attr_list[3].id = SAI_PERFMON_ATTR_METRICS_TIME_INTERVAL;
The SAI_PERFMON_ATTR_METRICS_TIME_INTERVAL attribute is missing, and its functionality is NOT specified. The spec defines the interval as always being between two invocations for a given ObjType + API_type.
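For reference, a self-contained sketch of building this attribute list. The enum values and the helper function are hypothetical; the real SAI_PERFMON_ATTR_METRICS_TIME_INTERVAL definition is exactly what this comment asks to be added to the header.

```c
#include <stdint.h>

/* Hypothetical attribute and metric enums mirroring the example;
 * the TIME_INTERVAL entry is the one missing from the header. */
typedef enum {
    SAI_PERFMON_ATTR_METRICS = 2,
    SAI_PERFMON_ATTR_METRICS_TIME_INTERVAL = 3,
} sai_perfmon_attr_t;

typedef enum { SAI_PERFMON_METRICS_AVERAGE_LATENCY = 1 } sai_perfmon_metrics_t;

typedef struct {
    int id;
    union { int32_t s32; uint32_t u32; } value;
} sai_attribute_t;

/* Fill the metric-selection and interval attributes as in the
 * spec's example; returns the number of attributes written. */
uint32_t fill_perfmon_attrs(sai_attribute_t *list, uint32_t interval_ms)
{
    uint32_t n = 0;
    list[n].id = SAI_PERFMON_ATTR_METRICS;
    list[n++].value.s32 = SAI_PERFMON_METRICS_AVERAGE_LATENCY;
    list[n].id = SAI_PERFMON_ATTR_METRICS_TIME_INTERVAL;
    list[n++].value.u32 = interval_ms;  /* msec */
    return n;
}

/* Usage example: build the list and report the configured interval. */
uint32_t configured_interval_ms(void)
{
    sai_attribute_t list[2];
    fill_perfmon_attrs(list, 1000);
    return list[1].value.u32;
}
```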
As discussed, the community concluded that we should not preserve this perfmon data across warmboot (especially since we thought it does not make sense for warm upgrades/downgrades).
 * @objects SAI_OBJECT_TYPE_PERFMON
 * @default empty
 */
SAI_SWITCH_ATTR_PERFMON_LIST,
As discussed on the call, we can rely on object creation to start the collection.
If so, change this attribute to read-only, as that allows enumerating the objects, and the metadata will enforce it.
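If the attribute becomes read-only, applications would enumerate perfmon objects with the usual SAI two-call list pattern: probe with a zero-sized list, receive SAI_STATUS_BUFFER_OVERFLOW along with the required count, then fetch with a large-enough buffer. A self-contained sketch with mocked types; the function names are hypothetical.

```c
#include <stdint.h>

#define SAI_STATUS_SUCCESS          0
#define SAI_STATUS_BUFFER_OVERFLOW (-1)

typedef uint64_t sai_object_id_t;
typedef struct { uint32_t count; sai_object_id_t *list; } sai_object_list_t;

/* Mock get for SAI_SWITCH_ATTR_PERFMON_LIST: three perfmon objects
 * exist in this sketch. Follows SAI list-query semantics: if the
 * caller's buffer is too small, report the required count and
 * return BUFFER_OVERFLOW. */
int get_perfmon_list(sai_object_list_t *objlist)
{
    static const sai_object_id_t oids[] = {0x100, 0x101, 0x102};
    const uint32_t n = 3;
    if (objlist->count < n) {
        objlist->count = n;              /* report required size */
        return SAI_STATUS_BUFFER_OVERFLOW;
    }
    objlist->count = n;
    for (uint32_t i = 0; i < n; i++)
        objlist->list[i] = oids[i];
    return SAI_STATUS_SUCCESS;
}

/* Two-call pattern: probe for the count, then fetch. Returns the
 * number of perfmon objects found. */
uint32_t enumerate_perfmon_objects(void)
{
    sai_object_list_t objlist = { 0, 0 };
    if (get_perfmon_list(&objlist) != SAI_STATUS_BUFFER_OVERFLOW)
        return objlist.count;            /* already fit (empty list) */
    sai_object_id_t buf[16];             /* >= required count probed above */
    objlist.list = buf;
    if (get_perfmon_list(&objlist) != SAI_STATUS_SUCCESS)
        return 0;
    return objlist.count;
}
```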
These metrics can be used to:
- Improve SAI adapter and SDK implementations
- Provide a baseline for comparing different hardware

- Instantaneous value: Provides [time, n], where n > 1 represents the number of objects in a bulk API, or n = 1 represents the last observed latency for a single object
I think the 'n' part is no longer kept. The description can be simplified to "last observed latency for the API call".
This PR brings in support for measuring SAI API performance. It is based on a presentation given at OCP 2023.